The loss surfaces of neural networks with general activation functions

Authors

Abstract

The loss surfaces of deep neural networks have been the subject of several studies, theoretical and experimental, over the last few years. One strand of work considers the complexity, in the sense of local optima, of high-dimensional random functions, with the aim of informing how optimisation methods may perform in such complicated settings. Prior work of Choromanska et al (2015) established a direct link between the training loss surfaces of deep multi-layer perceptron networks and spherical multi-spin glass models, under some very strong assumptions on the network and its data. In this work, we test the validity of this approach by removing the undesirable restriction to ReLU activation functions. In doing so, we chart a new path through the spin glass complexity calculations using supersymmetric methods in Random Matrix Theory which may prove useful in other contexts. Our results shed light on both the strengths and the weaknesses of spin glass models in this context.
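The abstract refers to the correspondence of Choromanska et al (2015) between multi-layer perceptron loss surfaces and spherical multi-spin glass models. As a purely illustrative sketch (not the paper's calculation), the spherical H-spin glass Hamiltonian can be evaluated numerically for H = 3 as below; the Gaussian couplings, the normalisation N^{-(H-1)/2}, and the spherical constraint ||w||^2 = N follow the standard definition of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

def spherical_3spin(w, X):
    """H_N(w) = N^{-(H-1)/2} * sum_{i,j,k} X_ijk w_i w_j w_k for H = 3,
    with w constrained to the sphere ||w||^2 = N."""
    N = w.shape[0]
    return np.einsum('ijk,i,j,k->', X, w, w, w) / N  # N^{(3-1)/2} = N

N = 40
X = rng.standard_normal((N, N, N))    # i.i.d. Gaussian couplings
w = rng.standard_normal(N)
w *= np.sqrt(N) / np.linalg.norm(w)   # project onto the sphere of radius sqrt(N)
print(spherical_3spin(w, X))
```

The complexity results in this line of work count the critical points of H_N on the sphere; a sample like the one above only evaluates the Hamiltonian at one point and is meant solely to fix notation.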


Similar articles

Stochastic Neural Networks with Monotonic Activation Functions

We propose a Laplace approximation that creates a stochastic unit from any smooth monotonic activation function, using only Gaussian noise. This paper investigates the application of this stochastic approximation in training a family of Restricted Boltzmann Machines (RBM) that are closely linked to Bregman divergences. This family, which we call exponential family RBM (Exp-RBM), is a subset of t...
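To make the idea concrete, here is a minimal sketch of one plausible reading: the unit's output is sampled as a Gaussian centred at f(x). Pairing the variance with f'(x) is an assumption made here for illustration, not a quotation of the paper's Laplace approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_unit(x, f, f_prime):
    """Gaussian stochastic unit built from a smooth monotonic activation f:
    sample around the mean f(x), with variance f'(x) (an assumed pairing)."""
    eps = rng.standard_normal(np.shape(x))
    return f(x) + np.sqrt(np.maximum(f_prime(x), 0.0)) * eps

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sigmoid_prime = lambda x: sigmoid(x) * (1.0 - sigmoid(x))

x = np.linspace(-3, 3, 5)
print(stochastic_unit(x, sigmoid, sigmoid_prime))
```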


Deep Neural Networks with Multistate Activation Functions

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions which are capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs p...
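The excerpt does not define the MSAFs precisely, so the sketch below uses one common multistate construction, a sum of shifted logistic sigmoids, which yields a staircase with n + 1 plateaus ("states"); the symmetric shift pattern is an assumption for illustration rather than the paper's exact form.

```python
import numpy as np

def msaf(x, n, gap=2.0):
    """n-order multistate activation: a sum of n shifted logistic sigmoids,
    producing a staircase with n + 1 plateaus (assumed construction)."""
    shifts = gap * (np.arange(n) - (n - 1) / 2.0)  # shifts symmetric about 0
    return sum(1.0 / (1.0 + np.exp(-(x - s))) for s in shifts)

x = np.linspace(-6, 6, 7)
print(msaf(x, n=3))  # values sweep through the plateaus 0, 1, 2, 3
```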


Recurrent neural networks with trainable amplitude of activation functions

An adaptive amplitude real time recurrent learning (AARTRL) algorithm for fully connected recurrent neural networks (RNNs) employed as nonlinear adaptive filters is proposed. Such an algorithm is beneficial when dealing with signals that have rich and unknown dynamical characteristics. Following an existing approach, three different cases for the algorithm are considered: a common adaptive amplitu...
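As a toy illustration of a trainable amplitude, the sketch below uses phi(x) = lam * tanh(x) and updates the scalar amplitude lam by plain gradient descent on a squared error; this stands in for, but is not, the full AARTRL recursion through a recurrent filter.

```python
import numpy as np

def amp_tanh(x, lam):
    """Activation with trainable amplitude: phi(x) = lam * tanh(x)."""
    return lam * np.tanh(x)

def amp_grad(x, err):
    """d(0.5 * err^2)/d lam with err = target - phi(x)."""
    return -err * np.tanh(x)

lam, lr = 1.0, 0.1
x, target = 0.8, 1.5
for _ in range(100):
    err = target - amp_tanh(x, lam)
    lam -= lr * amp_grad(x, err)
print(lam, amp_tanh(x, lam))  # amplitude adapts so phi(x) approaches the target
```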


Complex-valued Neural Networks with Non-parametric Activation Functions

Complex-valued neural networks (CVNNs) are a powerful modeling tool for domains where data can be naturally interpreted in terms of complex numbers. However, several analytical properties of the complex domain (e.g., holomorphicity) make the design of CVNNs a more challenging task than their real counterpart. In this paper, we consider the problem of flexible activation functions (AFs) in the c...
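The holomorphicity obstruction mentioned above is usually explained via Liouville's theorem: a bounded entire function on C is constant, so bounded activations cannot be holomorphic on all of C. The sketch below shows the standard split-type workaround, applying a real activation to the real and imaginary parts separately; note this baseline is not the paper's non-parametric proposal.

```python
import numpy as np

def split_tanh(z):
    """Split-type complex activation: apply a real activation (tanh)
    to the real and imaginary parts of z separately."""
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

z = np.array([1 + 2j, -0.5 + 0.3j])
print(split_tanh(z))
```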


Neural Networks with Smooth Adaptive Activation Functions for Regression

In Neural Networks (NNs), Adaptive Activation Functions (AAFs) have parameters that control the shapes of the activation functions; these parameters are trained along with the other parameters in the NN. AAFs have improved the performance of NNs in multiple classification tasks. In this paper, we propose and apply AAFs on feedforward NNs for regression tasks. We argue that applying AAFs in t...
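As one concrete example of an AAF, the sketch below uses a two-parameter form f(x) = a * tanh(b * x), where a and b control amplitude and slope and would be trained by backpropagation alongside the weights; this particular parametrisation is an assumption for illustration, not the form proposed in the paper.

```python
import numpy as np

def aaf(x, a, b):
    """Smooth adaptive activation with trainable shape parameters:
    f(x) = a * tanh(b * x) (assumed parametrisation)."""
    return a * np.tanh(b * x)

def aaf_grads(x, a, b):
    """Partial derivatives df/da and df/db needed to train (a, b) by backprop."""
    t = np.tanh(b * x)
    return t, a * (1 - t**2) * x

print(aaf(np.array([-1.0, 0.0, 2.0]), a=1.5, b=0.7))
```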



Journal

Journal title: Journal of Statistical Mechanics: Theory and Experiment

Year: 2021

ISSN: 1742-5468

DOI: https://doi.org/10.1088/1742-5468/abfa1e